Distributed Data Mining in the Grid Environment
Abstract
Grid computing has emerged as an important new branch of distributed computing, focused on large-scale resource sharing and high performance. Many applications require the analysis of very large data sets. The data are often large and geographically distributed, and their complexity is increasing. In this area, grid technologies provide effective computational support for applications such as knowledge discovery. This paper is an introduction to Grid infrastructure and its potential for machine learning tasks.
Similar resources
Workflow-based Tasks Scheduling on Grid
The distributed nature of data and the need for high performance make the Grid a suitable environment for distributed data mining (DDM). Since DDM applications are typically data intensive, one of the main requirements of such a DDM Grid environment is efficient workflow scheduling. We propose an architecture for a Knowledge Grid scheduler that results in the minimal r...
A Grid Data Mining Architecture for Learning Classifier Systems
Recently, there has been growing interest among researchers and software developers in exploring Learning Classifier Systems (LCS) implemented in parallel and distributed grid structures for data mining, due to their practical applications. The paper highlights some aspects of LCS and studies the competitive data mining model with homogeneous data. In order to establish more efficient dist...
A Survey of Dynamic Replication Strategies for Improving Response Time in Data Grid Environment
Large-scale data management is a critical problem in distributed systems such as the cloud, P2P systems, the World Wide Web (WWW), and Data Grids. One effective solution is data replication, which efficiently reduces the cost of communication and improves data reliability and response time. Various replication methods can be proposed depending on when, where, and how replicas are gener...
Distributed data mining in grid computing environments
Computing-intensive data mining for inherently Internet-wide distributed data, referred to as Distributed Data Mining (DDM), calls for the support of a powerful Grid with an effective scheduling framework. DDM often follows the computing paradigm of local processing and global synthesizing. It involves every phase of the Data Mining (DM) process, which makes the workflow of DDM very complex and c...
Applying Grid Technologies to Distributed Data Mining
The Grid promises improvements in the effectiveness with which global businesses are managed, if it enables distributed expertise to be efficiently applied to the analysis of distributed data. We report an ESRC-funded collaboration between EPCC in Edinburgh and Curtin University of Technology in Perth, Australia, that is applying public-domain Grid technologies to secure data mining within a co...
A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability
A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications pose complex problems such as job scheduling. Most existing scheduling algorithms in Grids focus on only one kind of Grid job, which can be data...
Publication date: 2012